
    Generalized Insertion Region Guides for Delaunay Mesh Refinement

    Mesh generation by Delaunay refinement is a widely used technique for constructing guaranteed-quality triangular and tetrahedral meshes. The quality guarantees are usually provided in terms of the bounds on the circumradius-to-shortest-edge ratio and on the grading of the resulting mesh. Traditionally, circumcenters of skinny elements and midpoints of boundary faces and edges are used as the positions of inserted points. Recently, however, variations of the traditional algorithms have been proposed that are designed to achieve certain optimization objectives by inserting new points in neighborhoods of the center points. In this paper we propose a general approach to the selection of point positions by defining one-, two-, and three-dimensional selection regions such that any point insertion strategy based on these regions is automatically endowed with the theoretical guarantees proven here. In particular, for input models defined by planar linear complexes under the assumption that no input angle is less than 90°, we prove the termination of the proposed generalized algorithm, as well as the fidelity and good grading of the resulting meshes.
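
    A minimal sketch of the idea, assuming for illustration that a 2-D selection region is a disk around the circumcenter of a skinny triangle; the paper's one-, two-, and three-dimensional region definitions are more general than this:

```python
# Illustrative sketch of region-based point insertion for 2-D Delaunay
# refinement. Assumption (not from the paper): the selection region is a
# disk of radius delta * R around the circumcenter of a skinny triangle.
import numpy as np

def circumcenter(a, b, c):
    """Circumcenter of triangle (a, b, c), each a 2-vector."""
    d = 2.0 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    ux = ((a @ a) * (b[1] - c[1]) + (b @ b) * (c[1] - a[1]) + (c @ c) * (a[1] - b[1])) / d
    uy = ((a @ a) * (c[0] - b[0]) + (b @ b) * (a[0] - c[0]) + (c @ c) * (b[0] - a[0])) / d
    return np.array([ux, uy])

def insertion_point(a, b, c, rho_bound=np.sqrt(2.0), delta=0.3, rng=None):
    """If the triangle is skinny (circumradius / shortest edge > rho_bound),
    return a point chosen inside a disk-shaped selection region around the
    circumcenter; otherwise return None."""
    if rng is None:
        rng = np.random.default_rng()
    center = circumcenter(a, b, c)
    R = np.linalg.norm(center - a)                      # circumradius
    lmin = min(np.linalg.norm(b - a), np.linalg.norm(c - b), np.linalg.norm(a - c))
    if R / lmin <= rho_bound:
        return None                                     # element already good
    # Any point of the region keeps the termination/grading guarantees,
    # so an optimizer is free to pick the best one; here we sample uniformly.
    angle = rng.uniform(0.0, 2.0 * np.pi)
    radius = delta * R * np.sqrt(rng.uniform())
    return center + radius * np.array([np.cos(angle), np.sin(angle)])
```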

    Towards Performance Portable Programming for Distributed Heterogeneous Systems

    Hardware heterogeneity is here to stay for high-performance computing. Large-scale systems are currently equipped with multiple GPU accelerators per compute node and are expected to incorporate more specialized hardware in the future. This shift in the computing ecosystem offers many opportunities for performance improvement; however, it also increases the complexity of programming for such architectures. This work introduces a runtime framework that enables effortless programming for heterogeneous systems while efficiently utilizing hardware resources. The framework is integrated within a distributed and scalable runtime system to facilitate performance portability across heterogeneous nodes. Along with the design, this paper describes the implementation and optimizations performed, achieving up to a 300% improvement in a shared-memory benchmark and up to a 10-fold improvement in distributed device communication. Preliminary results indicate that our software incurs low overhead and achieves a 40% improvement in a distributed Jacobi proxy application while hiding the idiosyncrasies of the hardware.
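
    The Jacobi proxy application mentioned above is not specified in detail; for reference, here is a minimal single-node sketch of the 2-D Jacobi sweep such a proxy typically iterates. The framework's task/device API and the inter-node halo exchange are deliberately elided:

```python
# Minimal sketch of the 2-D Jacobi kernel a Jacobi proxy application
# typically iterates. Assumptions: single node, NumPy only; the runtime
# framework's device abstraction and halo exchange are not shown.
import numpy as np

def jacobi(grid: np.ndarray, iterations: int, tol: float = 1e-6) -> np.ndarray:
    """Repeatedly replace each interior cell with the mean of its four
    neighbors; boundary values act as fixed Dirichlet conditions."""
    u = grid.copy()
    for _ in range(iterations):
        new_interior = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                               u[1:-1, :-2] + u[1:-1, 2:])
        diff = np.max(np.abs(new_interior - u[1:-1, 1:-1]))
        u[1:-1, 1:-1] = new_interior
        if diff < tol:                       # converged early
            break
    return u

# Usage: 1.0 on the top boundary, 0.0 elsewhere.
u0 = np.zeros((256, 256))
u0[0, :] = 1.0
result = jacobi(u0, iterations=1000)
```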

    Scalable Quantum Edge Detection Method for D-NISQ Imaging Simulations: Use Cases from Nuclear Physics and Medical Image Computing

    Edge detection is one of the computationally intensive modules in image analysis. It is used to find important landmarks by identifying a significant change (or “edge”) between pixels and voxels. We present a hybrid Quantum Edge Detection method by improving three aspects of an existing widely referenced implementation, which for our use cases generates incomprehensible results for the type and size of images we are required to process. Our contributions are in the pre- and post-processing (i.e., the classical phase) and in a quantum edge detection circuit: (1) we use space-filling curves to eliminate image artifacts introduced by the image decomposition, which is required to utilize the D-NISQ (Distributed Noisy Intermediate-Scale Quantum) model; (2) we introduce a new decrement permutation circuit and relevant optimizations for mapping realistic images on today's noisy Quantum Processing Units (QPUs); (3) we improve the encoding circuit fidelity to approximately 70%, reduce the edge detection circuit depth by approximately 11%, and reduce the number of CX gates by approximately 68% to under 100, by using a moderate number of 128 cores for 5-qubit QPUs in the D-NISQ simulations, which are enhanced with realistic noise models from IBM. An evaluation on MRI (Magnetic Resonance Imaging) data is underway, and we will report our findings.
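
    The decrement permutation circuit of contribution (2) is not spelled out in the abstract; the sketch below shows one standard way to build an n-qubit decrement (|x⟩ → |x−1 mod 2^n⟩) from an X gate and a ladder of multi-controlled X gates in Qiskit. This is an illustrative textbook construction, not necessarily the authors' optimized circuit:

```python
# Sketch of an n-qubit decrement permutation |x> -> |x - 1 mod 2^n>,
# built from one X gate and multi-controlled X gates. Illustrative only;
# the paper's optimized circuit for noisy QPUs is not reproduced here.
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def decrement_circuit(n: int) -> QuantumCircuit:
    qc = QuantumCircuit(n, name="decrement")
    qc.x(0)                         # flip the least significant bit
    for k in range(1, n):
        qc.mcx(list(range(k)), k)   # borrow propagates while lower bits are 1
    return qc

# Usage: decrementing |000> yields |111> (0 - 1 = 7 mod 8).
state = Statevector.from_label("000").evolve(decrement_circuit(3))
print(state.probabilities_dict())   # {'111': 1.0}
```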

    Evaluation of Scalable Quantum and Classical Machine Learning for Particle Tracking Classification in Nuclear Physics

    Future particle accelerators will far exceed the current data size (10^15) per experiment, and high-luminosity program(s) will produce more than 300 times as much data. Classical Machine Learning (ML) will likely benefit from new tools based on quantum computing. Particle track reconstruction is the most computationally intensive process in nuclear physics experiments. A combinatorial approach exhaustively tests track measurements (“hits”), represented as images, to identify those that form an actual particle trajectory, which is then used to reconstruct track parameters necessary for the physics experiment. Quantum Machine Learning (QML) could improve this process in multiple ways, including denoising the data, classifying candidate tracks, and reconstructing the track parameters without conventional processing. We present our contributions to the candidate track classification problem using a quantum convolutional network for Noisy Intermediate-Scale Quantum (NISQ) computers: (1) an artifact-free image decomposition of the particle images into sub-images to cope with the 5% fidelity of a 12-qubit circuit required for the encoding of the full image (currently used in classical ML); (2) a 70% fidelity improvement utilizing the Distributed NISQ (D-NISQ) model, which uses 128 cores to run our simulations on 5-qubit Quantum Processing Units with different, realistic noise models; (3) an evaluation of D-NISQ QML in terms of accuracy against the 99% precision of an optimized classical convolutional network we published recently.
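
    The sub-image decomposition of contribution (1) is not detailed in the abstract; as an illustration, the sketch below splits an image into fixed-size tiles and normalizes each tile for 5-qubit amplitude encoding (32 amplitudes per tile). The 4x8 tile shape and row-major traversal are assumptions; the authors use an artifact-free decomposition instead:

```python
# Illustrative sketch: decompose an image into tiles small enough for
# amplitude encoding on a 5-qubit circuit (2**5 = 32 amplitudes per tile).
# The 4x8 tile shape and row-major traversal are assumptions, not the
# paper's artifact-free scheme.
import numpy as np

QUBITS = 5
TILE = (4, 8)                                 # 4 * 8 = 2**QUBITS pixels

def decompose(image: np.ndarray):
    """Yield (row, col, normalized_tile) triples ready for amplitude encoding."""
    h, w = image.shape
    th, tw = TILE
    assert h % th == 0 and w % tw == 0, "pad the image to a tile multiple first"
    for r in range(0, h, th):
        for c in range(0, w, tw):
            tile = image[r:r + th, c:c + tw].astype(float).ravel()
            norm = np.linalg.norm(tile)
            if norm > 0.0:                    # amplitude vectors must be unit length
                tile = tile / norm
            yield r, c, tile

# Usage: a 64x64 image (2**12 pixels) becomes 128 five-qubit sub-problems.
tiles = list(decompose(np.random.rand(64, 64)))
print(len(tiles))                             # 128
```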

    Multitissue Tetrahedral Image-to-Mesh Conversion with Guaranteed Quality and Fidelity

    We present a novel algorithm for tetrahedral image-to-mesh conversion which allows for guaranteed bounds on the smallest dihedral angle and on the distance between the boundaries of the mesh and the boundaries of the tissues. The algorithm produces a small number of mesh elements that comply with these bounds. We also describe and evaluate our implementation of the proposed algorithm, which is comparable in performance with a state-of-the-art Delaunay code but in addition solves the small dihedral angle problem.
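
    The quality bound above is on the smallest dihedral angle; for reference, a minimal sketch of how the six dihedral angles of a tetrahedron can be computed (standard geometry, not the paper's meshing algorithm):

```python
# Minimal sketch: compute the six dihedral angles of a tetrahedron, the
# quality measure the algorithm bounds. Standard geometry for reference.
import itertools
import numpy as np

def dihedral_angles(p0, p1, p2, p3):
    """Return the six dihedral angles (degrees), one per edge."""
    verts = [np.asarray(p, dtype=float) for p in (p0, p1, p2, p3)]
    angles = []
    for i, j in itertools.combinations(range(4), 2):   # shared edge (i, j)
        k, l = [m for m in range(4) if m not in (i, j)]
        e = verts[j] - verts[i]
        # e x a equals e x a_perp, so the angle between the two cross
        # products is exactly the dihedral angle along edge (i, j).
        n1 = np.cross(e, verts[k] - verts[i])
        n2 = np.cross(e, verts[l] - verts[i])
        cosang = np.dot(n1, n2) / (np.linalg.norm(n1) * np.linalg.norm(n2))
        angles.append(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))))
    return angles

# Usage: a regular tetrahedron has all dihedral angles ~70.53 degrees.
reg = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
print(min(dihedral_angles(*reg)))
```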

    Parallel Anisotropic Unstructured Grid Adaptation

    Computational Fluid Dynamics (CFD) has become critical to the design and analysis of aerospace vehicles. Parallel grid adaptation that resolves multiple scales with anisotropy is identified as one of the challenges in the CFD Vision 2030 Study to increase the capacity and capability of CFD simulation. The Study also cautions that computer architectures are undergoing a radical change and that dramatic increases in algorithm concurrency will be required to exploit full performance. This paper reviews four different methods for parallel anisotropic grid generation. They cover both ends of the spectrum: (i) using existing state-of-the-art software optimized for a single core and modifying it for parallel platforms and (ii) designing and implementing scalable software with incomplete, but rapidly maturing, functionality. A brief overview of each grid adaptation system is presented in the context of a telescopic approach for multilevel concurrency. These methods employ different approaches to enable parallel execution, which provides a unique opportunity to illustrate the relative behavior of each approach. Qualitative and quantitative metric evaluations are used to draw lessons for future developments in this critical area for parallel CFD simulation.
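
    As background (not specific to any of the four reviewed methods), anisotropic adaptation is typically driven by a Riemannian metric field: an edge e has metric length L_M(e) = sqrt(e^T M e), and edges far from unit metric length are refined or coarsened. A minimal sketch:

```python
# Background sketch: anisotropic adaptation is commonly driven by a metric
# tensor M, under which an edge e has length L_M(e) = sqrt(e^T M e).
# Edges much longer than 1 in the metric are split; much shorter ones
# are collapsed. Not specific to any of the four reviewed systems.
import numpy as np

def metric_edge_length(p: np.ndarray, q: np.ndarray, M: np.ndarray) -> float:
    e = q - p
    return float(np.sqrt(e @ M @ e))

# Usage: a metric requesting spacing 0.1 along x and 1.0 along y
# (M = diag(1/h_x**2, 1/h_y**2)), so a unit-length x-edge measures 10.
M = np.diag([1.0 / 0.1**2, 1.0 / 1.0**2])
print(metric_edge_length(np.array([0.0, 0.0]), np.array([1.0, 0.0]), M))  # 10.0
```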

    A Framework for Parallel Unstructured Grid Generation for Complex Aerodynamic Simulations

    A framework for parallel unstructured grid generation targeting both shared-memory multiprocessors and distributed-memory architectures is presented. The two fundamental building blocks of the framework consist of: (1) the Advancing-Partition (AP) method used for domain decomposition and (2) the Advancing Front (AF) method used for mesh generation. Starting from the surface mesh of the computational domain, the AP method is applied recursively to generate a set of sub-domains. Next, the sub-domains are meshed in parallel using the AF method. The recursive nature of domain decomposition naturally maps to a divide-and-conquer algorithm which exhibits inherent parallelism. For the parallel implementation, the Master/Worker pattern is employed to dynamically balance the varying workloads of each task on the set of available CPUs. Performance results obtained with this approach are presented and discussed in detail, along with future work and improvements.
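
    A minimal sketch of the decompose-then-mesh pattern described above, using Python's multiprocessing pool as the Master/Worker layer. The Domain, split, and mesh_subdomain names are illustrative stand-ins; the real AP and AF methods operate on surface meshes, not the toy payloads here:

```python
# Minimal sketch of the decompose-then-mesh pattern: recursively split the
# domain (stand-in for Advancing-Partition), then mesh the leaves in
# parallel (stand-in for Advancing Front) via a Master/Worker pool.
from dataclasses import dataclass
from multiprocessing import Pool

@dataclass
class Domain:
    lo: float
    hi: float

def split(d: Domain):
    """Stand-in for the AP method: bisect the domain."""
    mid = 0.5 * (d.lo + d.hi)
    return Domain(d.lo, mid), Domain(mid, d.hi)

def decompose(d: Domain, depth: int):
    """Recursive divide-and-conquer decomposition into 2**depth sub-domains."""
    if depth == 0:
        return [d]
    left, right = split(d)
    return decompose(left, depth - 1) + decompose(right, depth - 1)

def mesh_subdomain(d: Domain) -> int:
    """Stand-in for the AF method; returns a fake element count."""
    return int(1000 * (d.hi - d.lo))

if __name__ == "__main__":
    subdomains = decompose(Domain(0.0, 1.0), depth=3)   # 8 sub-domains
    with Pool() as pool:                                # master/worker balancing
        counts = pool.map(mesh_subdomain, subdomains)
    print(sum(counts))
```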

    A Machine Learning Approach to Denoising Particle Detector Observations in Nuclear Physics

    With the evolution in detector technologies and electronic components used in the Nuclear Physics field, experimental setups become larger and more complex. Faster electronics enable particle accelerator experiments to run with higher beam intensity, providing more interactions per unit time and more particles per interaction. However, the increased beam intensities present a challenge to particle detectors because of the higher amount of noise and uncorrelated signals. Higher noise levels make the particle reconstruction process more challenging by increasing the number of hit combinations to analyze and background signals to eliminate. On the other hand, increasing the beam intensity can deliver physics outcomes faster if combined with a highly efficient track reconstruction process. Thus, a method that provides efficient tracking under high-luminosity conditions can significantly reduce the amount of time required to conduct physics experiments. In this poster, we present a machine learning (ML) approach for denoising data from particle tracking detectors to improve the track reconstruction efficiency of the CLAS12 detector at Jefferson Lab (JLab). A noise-reducing Convolutional Autoencoder was used to process data for standard experimental running conditions and showed significant improvements in track reconstruction efficiency (>15%). The studies were extended to synthetically generated data emulating much higher beam intensity and showed that the ML approach outperforms conventional algorithms, providing a significant increase in track reconstruction efficiency of up to 80%. This tremendous increase in reconstruction efficiency allows experiments to run at almost three times higher luminosity, leading to significant savings in time (about three times shorter) and cost. The software developed by this work is now part of the CLAS12 workflow, assisting scientists at JLab and collaborating institutions.
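
    The abstract names a noise-reducing Convolutional Autoencoder but not its architecture; below is a minimal PyTorch sketch of that class of model. The layer sizes and the single-channel 2-D hit-map input are assumptions for illustration, not the CLAS12 production network:

```python
# Minimal sketch of a denoising convolutional autoencoder of the kind the
# poster describes. Layer sizes and the single-channel 2-D "hit map"
# input are illustrative assumptions, not the CLAS12 production model.
import torch
from torch import nn

class DenoisingAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),   # downsample
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2,
                               padding=1, output_padding=1),        # upsample
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.Sigmoid(),                                           # hit probabilities
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Training pairs noisy inputs with clean targets; at inference the model
# suppresses uncorrelated background hits before track finding.
model = DenoisingAutoencoder()
noisy = torch.rand(8, 1, 112, 112)            # batch of noisy hit maps
clean = torch.rand(8, 1, 112, 112)            # placeholder clean targets
loss = nn.functional.binary_cross_entropy(model(noisy), clean)
```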

    Making Sense of Video Analytics: Lessons Learned from Clickstream Interactions, Attitudes, and Learning Outcome in a Video-Assisted Course

    Online video lectures have been considered an instructional medium for various pedagogic approaches, such as the flipped classroom and open online courses. In comparison to other instructional media, online video affords the opportunity for recording student clickstream patterns within a video lecture. Video analytics within lecture videos may provide insights into student learning performance and inform the improvement of video-assisted teaching tactics. Nevertheless, video analytics are not accessible to learning stakeholders, such as researchers and educators, mainly because online video platforms do not broadly share the interactions of the users with their systems. For this purpose, we have designed an open-access video analytics system for use in a video-assisted course. In this paper, we present a longitudinal study, which provides valuable insights through the lens of the collected video analytics. In particular, we found that there is a relationship between video navigation (repeated views) and the level of cognition/thinking required for a specific video segment. Our results indicated that learning performance progress was slightly improved and stabilized after the third week of the video-assisted course. We also found that attitudes regarding easiness, usability, usefulness, and acceptance of this type of course remained at the same levels throughout the course. Finally, we triangulate analytics from diverse sources, discuss them, and provide the lessons learned for further development and refinement of video-assisted courses and practices.
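
    The repeated-views finding above comes from aggregating clickstream events per video segment; a minimal sketch of that kind of aggregation follows. The event schema and segment granularity are assumptions, not the system's actual log format:

```python
# Minimal sketch of the kind of clickstream aggregation behind the
# repeated-views finding: count how often each video segment is replayed.
# The (video_id, second) event schema is assumed, not the system's format.
from collections import Counter

SEGMENT_SECONDS = 30          # assumed segment granularity

def repeated_views(play_events, duration):
    """play_events: iterable of (video_id, second_watched) tuples.
    Returns, per video, a Counter of views per segment index."""
    per_video = {}
    for video_id, second in play_events:
        segment = min(int(second) // SEGMENT_SECONDS, duration // SEGMENT_SECONDS)
        per_video.setdefault(video_id, Counter())[segment] += 1
    return per_video

# Usage: segment 1 of "lec01" was replayed more often than segment 0,
# flagging it as a candidate high-cognitive-load passage.
events = [("lec01", 5), ("lec01", 40), ("lec01", 45), ("lec01", 52)]
print(repeated_views(events, duration=120)["lec01"])   # Counter({1: 3, 0: 1})
```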

    Extreme-Scale Parallel Mesh Generation: Telescopic Approach

    In this poster we present our preliminary results pertaining to the integration of multiple parallel Delaunay mesh generation methods into a coherent hierarchical framework. The goal of this project is to study our telescopic approach and to develop Delaunay-based methods to explore concurrency at all hardware layers using abstractions at (a) the medium-grain level for many cores within a single chip and (b) the coarse-grain level, i.e., the sub-domain level, using proper error metric- and application-specific continuous decomposition methods.